One-Shot-Learning Gesture Recognition Using HOG-HOF Features

Authors

  • Jakub Konecný
  • Michal Hagara
Abstract

This paper describes one-shot-learning gesture recognition systems developed on the ChaLearn Gesture Dataset (ChaLearn). We use RGB and depth images and combine appearance descriptors (Histograms of Oriented Gradients, HOG) with motion descriptors (Histograms of Optical Flow, HOF) for parallel temporal segmentation and recognition. The Quadratic-Chi distance family is used to measure differences between histograms to capture cross-bin relationships. We also propose a new algorithm for trimming videos, that is, for removing all unimportant frames. We present two methods that combine HOG-HOF descriptors with variants of the Dynamic Time Warping technique. Both methods outperform other published methods and help narrow the gap between human and algorithmic performance on this task. The code is publicly available in the MLOSS repository.
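
As a rough illustration of two ingredients named in the abstract, the sketch below computes a Quadratic-Chi histogram distance and uses it as the frame-to-frame cost in a textbook Dynamic Time Warping recursion. It is not the authors' implementation: the function names, the banded bin-similarity matrix, the normalization exponent `m=0.5`, and the use of plain DTW (rather than the variants the paper refers to) are all assumptions made for the example.

```python
import numpy as np

def band_similarity(n_bins, width):
    """Hypothetical bin-similarity matrix: A[i, j] = max(0, 1 - |i - j| / width)."""
    idx = np.arange(n_bins)
    return np.maximum(0.0, 1.0 - np.abs(idx[:, None] - idx[None, :]) / width)

def quadratic_chi(p, q, A, m=0.5):
    """Quadratic-Chi distance between non-negative histograms p and q.

    A is a non-negative, symmetric bin-similarity matrix with a positive
    diagonal; m is the normalization exponent (assumed value 0.5).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    z = (p + q) @ A            # z[i] = sum_c (p[c] + q[c]) * A[c, i]
    z[z == 0] = 1.0            # 0/0 is treated as 0: the numerator is 0 there
    d = (p - q) / z ** m
    return float(np.sqrt(max(d @ A @ d, 0.0)))

def dtw_cost(seq_a, seq_b, dist):
    """Classic DTW: minimal accumulated cost of aligning two descriptor sequences."""
    n, k = len(seq_a), len(seq_b)
    acc = np.full((n + 1, k + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, k + 1):
            cost = dist(seq_a[i - 1], seq_b[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return float(acc[n, k])

# Toy usage: per-frame histograms stand in for HOG-HOF descriptors.
A = band_similarity(4, 2)
frame_dist = lambda p, q: quadratic_chi(p, q, A)
template = [[1, 0, 0, 0], [0, 0, 1, 0]]
query = [[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0]]
print(dtw_cost(template, query, frame_dist))
```

In a one-shot setting, a query sequence would presumably be compared against one template per gesture class and assigned to the class with the lowest alignment cost; that matching rule is likewise an assumption here.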

Similar resources

One-Shot-Learning Gesture Recognition Using HOG-HOF Features (Bachelor Thesis)

The purpose of this thesis is to describe one-shot-learning gesture recognition systems developed on the ChaLearn Gesture Dataset [3]. We use RGB and depth images and combine appearance (Histograms of Oriented Gradients) and motion descriptors (Histogram of Optical Flow) for parallel temporal segmentation and recognition. The Quadratic-Chi distance family is used to measure differences between ...

One-Shot Learning Gesture Recognition from RGB-D Data Using Bag of Features

For one-shot learning gesture recognition, two important challenges are: how to extract distinctive features and how to learn a discriminative model from only one training sample per gesture class. For feature extraction, a new spatio-temporal feature representation called 3D enhanced motion scale-invariant feature transform (3D EMoSIFT) is proposed, which fuses RGB-D data. Compared with other ...

Action Recognition Using Accelerated Local Descriptors and Temporal Variation

Our system performs late fusion of several features whose weights have been optimized on the UCF50 dataset. The fusion is done over the following features: 1) Our newly developed fast local descriptors for HoG and HoF, both grey-scale and RGB. In RGB-HoG/HoF, we compute the dense HoG and HoF descriptors for all color channels and concatenate them. To obtain a single vector per video, we use the Fish...

Adaptive Local Spatiotemporal Features from RGB-D Data for One-Shot Learning Gesture Recognition

Noise and constant empirical motion constraints affect the extraction of distinctive spatiotemporal features from one or a few samples per gesture class. To tackle these problems, an adaptive local spatiotemporal feature (ALSTF) using fused RGB-D data is proposed. First, motion regions of interest (MRoIs) are adaptively extracted using grayscale and depth velocity variance information to greatl...

A Human-Centered Approach to One-Shot Gesture Learning

This article discusses the problem of one-shot gesture recognition using a human-centered approach and its potential application to fields such as human–robot interaction where the user’s intentions are indicated through spontaneous gesturing (one shot). Casual users have limited time to learn the gesture interface, which makes one-shot recognition an attractive alternative to interface customi...

Publication date: 2014